Algorithmic trading, also known as quantitative trading, consists of trading strategies based on quantitative analysis, which relies on mathematical computation to identify trading opportunities. Two of the biggest and most popular families of trading strategies are mean reversion and trend following, the latter often called a momentum strategy.
My momentum trading algorithm was built on the premise that big price expansions start with compression, and it's best to follow the expansion until it starts to fade. I aim to use linear regression to find relationships between different features of my strategy and make informed decisions to fine-tune it, with the goal of increasing profitability.
# data processing
import numpy as np
import pandas as pd
import datetime as dt
# visualization
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
from plotly.subplots import make_subplots
# model algorithm
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
# loading ethusd price data from csv file
ethusd_data = pd.read_csv('ETHUSD-5m-data.csv', parse_dates=True)
# ethusd data columns and types
ethusd_data.info()
# ethusd data dimensionality
ethusd_data.shape
# ethusd data sample
ethusd_data.sample(5)
# check missing data
ethusd_data.isnull().sum()
I see no problem with the missing vwap and lastSize values because I will only use the timestamp and OHLC columns and delete the rest of the unnecessary columns. I will also delete the first row (the only row with missing OHLC values); even after removing it, I still have plenty of data to work with.
# dropping unnecessary columns
ethusd_data = ethusd_data.drop(columns=['trades','symbol','volume','vwap','lastSize','turnover','homeNotional','foreignNotional'])
# drop na value
ethusd_data.dropna(inplace=True)
I will change the order of the columns, then fix the datatype of timestamp from object to datetime. Because I see a five-minute discrepancy between the data and the live chart on the BitMEX exchange, I will also shift the timestamp column using a timedelta.
# column ordering
ethusd_data = ethusd_data[['timestamp', 'open','high','low','close']]
# fixing timestamp dtypes, and adding time delta
ethusd_data['timestamp'] = pd.to_datetime(ethusd_data['timestamp'])
ethusd_data['timestamp'] = ethusd_data['timestamp'] + pd.Timedelta(minutes=-5)
ethusd_data = ethusd_data.reset_index(drop=True)
ethusd_data.info()
ethusd_data.shape
ethusd_data.sample(5)
# loading algorithm data from csv file
algo_data = pd.read_csv('algo.csv')
# algo data columns and types
algo_data.info()
# algo data dimensionality
algo_data.shape
# algo data sample
algo_data.sample(5)
# check missing data
algo_data.isnull().sum()
My algo requires a rolling 200-bar window as its source, so the first 199 rows are empty, but that won't be a problem for our data. I will join the ETHUSD and algo data.
# join ethusd price data and algo data
ethusd_data = ethusd_data.join(algo_data)
ethusd_data.info()
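One caveat worth noting: join aligns rows by the shared integer index, so this assumes algo.csv contains exactly the same rows, in the same order, as the price data. A minimal sketch of that behavior on toy frames (values hypothetical):

```python
import pandas as pd

price = pd.DataFrame({'close': [100.0, 101.0, 102.0]})
bands = pd.DataFrame({'upper': [100.5, 101.5, 102.5],
                      'lower': [99.5, 100.5, 101.5]})

# join attaches row i of bands to row i of price via the default RangeIndex
combined = price.join(bands)
print(combined.shape)          # (3, 3)
print(list(combined.columns))  # ['close', 'upper', 'lower']
```

If the row orders could ever differ, merging on a shared timestamp column would be the safer choice.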
The Momentum Band/Channel consists of five lines. I will show an interactive visualization and compare it to a regular moving average indicator to illustrate the concept further.
# adding 20 ema into ethusd data
ethusd_data['20ema'] = np.round(ethusd_data['close'].ewm(span=20, adjust=False).mean(), decimals=5)
# visualize only small part of the data for scaling purpose
band_data = ethusd_data.iloc[-1200:]
# creating subplots
fig = make_subplots(rows=1, cols=2, subplot_titles=("20 EMA", "Momentum Band"))
# candlestick chart for 20 ema
fig.add_trace(go.Candlestick(
x=band_data['timestamp'],
open=band_data['open'],
high=band_data['high'],
low=band_data['low'],
close=band_data['close'],
name="Candlestick"),
row=1, col=1)
# plotting 20 ema
fig.add_trace(go.Scatter(
x=band_data['timestamp'],
y=band_data['20ema'],
name="20 EMA",
line_color="turquoise"),
row=1, col=1)
# candlestick chart for momentum band
fig.add_trace(go.Candlestick(
x=band_data['timestamp'],
open=band_data['open'],
high=band_data['high'],
low=band_data['low'],
close=band_data['close'],
name="Candlestick"),
row=1, col=2)
# plotting momentum band
fig.add_trace(go.Scatter(
x=band_data['timestamp'],
y=band_data['upper'],
name="Upper Band",
line_color="red"),
row=1, col=2)
fig.add_trace(go.Scatter(
x=band_data['timestamp'],
y=band_data['upper_middle'],
name="Upper Middle Band",
line_color="salmon"),
row=1,col=2)
fig.add_trace(go.Scatter(
x=band_data['timestamp'],
y=band_data['middle'],
name="Middle Band",
line_color="yellow"),
row=1, col=2)
fig.add_trace(go.Scatter(
x=band_data['timestamp'],
y=band_data['lower_middle'],
name="Lower Middle Band",
line_color="yellowgreen"),
row=1, col=2)
fig.add_trace(go.Scatter(
x=band_data['timestamp'],
y=band_data['lower'],
name="Lower Band",
line_color="lightseagreen"),
row=1, col=2)
# axes title and hiding rangeslider
fig.update_xaxes(title="Time", row=1, col=1,rangeslider_visible=False)
fig.update_yaxes(title="Price", row=1, col=1)
fig.update_xaxes(title="Time", row=1, col=2,rangeslider_visible=False)
fig.update_yaxes(title="Price", row=1, col=2)
fig.show()
Unlike an exponential moving average (left chart), the momentum band lets us infer compression or expansion. I call it compression when the upper and lower bands move closer to each other, and expansion when they move further apart. To quantify this, we need to generate a range width (%) and then look at its distribution to see if we can seize an opportunity from it.
# engineering range_width feature
ethusd_data['range_width'] = ((ethusd_data['upper']-ethusd_data['lower'])/ethusd_data['lower']) * 100
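For intuition on the formula above: with hypothetical band values of upper = 202 and lower = 200, the range width is (202 - 200) / 200 * 100 = 1%.

```python
upper, lower = 202.0, 200.0  # hypothetical band values
range_width = (upper - lower) / lower * 100
print(range_width)  # 1.0
```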
# range width distribution
x = ethusd_data['range_width']
fig = go.Figure(data=[go.Histogram(x=x, nbinsx=100)])
fig.show()
There are over ten thousand rows with a range width between 0 and 1 percent; from that distribution alone, we can conclude that there are plenty of opportunities to trade on.
Our concern now is to design and evaluate the algorithm. Unlike discretionary trading, a trading algorithm runs on a predefined set of rules. When the rules are met, it fires off a signal to buy or to sell.
To enter a trade, this algorithm has three key rules: the band must shift from compression to expansion (a momentum shift), the range width just before the shift must be below the range parameter, and the band must be flat just before the shift.
Additionally, in any trade, a trader must have an exit strategy: a set of conditions determining when to exit the position, for either profit or loss. The exit strategy for this algorithm is a fixed -1% stop loss, or the close crossing back through the inner middle band.
A momentum shift happens when the range shifts from compression to expansion.
A flat band occurs when there are consecutive identical values for a band, particularly the outer bands.
# removing unnecessary column
ethusd_data = ethusd_data.drop(columns=['20ema'])
# flagging consecutive values for flat band
ethusd_data['upper_cons'] = 0
ethusd_data['lower_cons'] = 0
uc = 0
lc = 0
for index, row in ethusd_data[2:].iterrows():
    if ethusd_data.loc[index, 'upper'] == ethusd_data.loc[index-1, 'upper']:
        uc = uc + 1
    else:
        uc = 0
    ethusd_data.loc[index, 'upper_cons'] = uc
    if ethusd_data.loc[index, 'lower'] == ethusd_data.loc[index-1, 'lower']:
        lc = lc + 1
    else:
        lc = 0
    ethusd_data.loc[index, 'lower_cons'] = lc
ethusd_data[['upper_cons', 'lower_cons']].describe()
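As a side note, the same consecutive-value counter can be computed without an explicit loop using a vectorized pandas idiom (a sketch on a toy series, not the notebook's original approach):

```python
import pandas as pd

band = pd.Series([5.0, 5.0, 5.0, 6.0, 6.0, 7.0])  # hypothetical band values

# start a new group wherever the value changes, then count position within each run
runs = band.groupby(band.ne(band.shift()).cumsum()).cumcount()
print(runs.tolist())  # [0, 1, 2, 0, 1, 0]
```

This yields the same "bars since the band last changed" count as upper_cons/lower_cons, and scales much better on long histories.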
We now have the range width and the consecutive identical band values in hand. The next step is generating signals from our predefined rules.
# create column and set the value of signal to 0 or hold
ethusd_data['signal'] = 0
# set the range parameter to 1.5
range_parameter = 1.5
for index, row in ethusd_data[2:].iterrows():
    # long signal: band expanding to the upside after a tight, flat range
    if (ethusd_data.loc[index, 'upper'] > ethusd_data.loc[index-1, 'upper'] and
            ethusd_data.loc[index-1, 'range_width'] < range_parameter and  # range width below 1.5 before momentum shift
            ethusd_data.loc[index-1, 'upper_cons'] > 0):  # band flat before momentum shift
        ethusd_data.loc[index, 'signal'] = 1
    # short signal: band expanding to the downside after a tight, flat range
    if (ethusd_data.loc[index, 'lower'] < ethusd_data.loc[index-1, 'lower'] and
            ethusd_data.loc[index-1, 'range_width'] < range_parameter and  # range width below 1.5 before momentum shift
            ethusd_data.loc[index-1, 'lower_cons'] > 0):  # band flat before momentum shift
        ethusd_data.loc[index, 'signal'] = -1
ethusd_data['signal'].value_counts()
A signal value of 1 indicates long/buy, 0 indicates hold, and -1 indicates short/sell.
Once we have the signals, we should evaluate the quality of the strategy. Traders call this evaluation backtesting: measuring how profitable the strategy would have been on historical data. We will also engineer features for our model to improve the momentum algorithm, iterating through our simulation data and collecting the data needed for the model at the same time.
simulation_data = ethusd_data

def get_trades(simulation_data):
    position = 0
    stop_loss = -1
    # data for our model
    side = []
    durations = []
    open_date = []
    close_date = []
    open_range = []  # open range is range width at trade entry
    close_range = []  # close range is range width at trade exit
    returns = []  # returns is the percent difference of price between entry and exit
    initial_momentum = []  # initial momentum is open range minus the compressed range width
    for index, rows in simulation_data.iterrows():
        if position == 0:
            if simulation_data.loc[index, 'signal'] == 1:
                position = 1  # long/buy
                bp = simulation_data.loc[index+1, 'open']  # buying point
                open_index = index
                open_date.append(simulation_data.loc[index+1, 'timestamp'])  # trade open timestamp
                open_range.append(simulation_data.loc[index, 'range_width'])
                initial_momentum.append(simulation_data.loc[index, 'range_width'] - simulation_data.loc[index-1, 'range_width'])
                side.append(position)
            elif simulation_data.loc[index, 'signal'] == -1:
                position = -1  # short/sell
                sp = simulation_data.loc[index+1, 'open']  # selling point
                open_index = index
                open_date.append(simulation_data.loc[index+1, 'timestamp'])
                open_range.append(simulation_data.loc[index, 'range_width'])
                initial_momentum.append(simulation_data.loc[index, 'range_width'] - simulation_data.loc[index-1, 'range_width'])
                side.append(position)
        elif position == 1:  # long/buy trade close branch
            if (simulation_data.loc[index+1, 'low'] - bp) / bp * 100 <= stop_loss:  # stop loss
                position = 0  # trade close
                duration = index - open_index  # trade duration
                returns.append(stop_loss)
                close_date.append(simulation_data.loc[index, 'timestamp'])  # trade close timestamp
                close_range.append(simulation_data.loc[index, 'range_width'])
                durations.append(duration)
            elif simulation_data.loc[index, 'close'] <= simulation_data.loc[index, 'upper_middle']:
                position = 0
                duration = index - open_index
                sp = simulation_data.loc[index, 'close']
                pc = ((sp - bp) / bp) * 100
                returns.append(pc)
                close_date.append(simulation_data.loc[index, 'timestamp'])
                close_range.append(simulation_data.loc[index, 'range_width'])
                durations.append(duration)
        elif position == -1:  # short/sell trade close branch
            if (sp - simulation_data.loc[index, 'high']) / sp * 100 <= stop_loss:  # stop loss
                position = 0
                duration = index - open_index
                returns.append(stop_loss)
                close_date.append(simulation_data.loc[index, 'timestamp'])
                close_range.append(simulation_data.loc[index, 'range_width'])
                durations.append(duration)
            elif simulation_data.loc[index, 'close'] >= simulation_data.loc[index, 'lower_middle']:
                position = 0
                duration = index - open_index
                bp = simulation_data.loc[index, 'close']
                pc = ((sp - bp) / sp) * 100
                returns.append(pc)
                close_date.append(simulation_data.loc[index, 'timestamp'])
                close_range.append(simulation_data.loc[index, 'range_width'])
                durations.append(duration)
    trades = pd.DataFrame({'open_date': open_date,
                           'close_date': close_date,
                           'duration': durations,
                           'open_range': open_range,
                           'close_range': close_range,
                           'initial_momentum': initial_momentum,
                           'side': side,
                           'returns': returns})
    return trades

trades = get_trades(simulation_data)
# trades data columns and types
trades.info()
# trades data dimensionality
trades.shape
# trades data sample
trades.sample(5)
Now we have four features for our model: open_range, close_range, initial_momentum, and duration.
But before we create our model, we should evaluate our strategy without fine-tuning it first.
We want to improve our strategy, but without first evaluating how profitable it is, we won't know how much of an improvement our fine-tuning made. Here are some basic trading metrics that are useful for measuring strategy performance:
def executed_trades(returns):
    return len(returns)

def winning_trades(returns):
    wins = returns[returns > 0]
    return len(wins)

def losing_trades(returns):
    losses = returns[returns < 0]
    return len(losses)

def even_trades(returns):
    even = returns[returns == 0]
    return len(even)

def win_percent(returns):
    return np.round(winning_trades(returns) / len(returns) * 100, decimals=2)

def max_win(returns):
    return np.round(max(returns), decimals=2)

def min_lose(returns):
    return np.round(min(returns), decimals=2)

def avg_win(returns):
    wins = returns[returns > 0]
    return np.round(np.mean(wins), decimals=2)

def avg_lose(returns):
    losses = returns[returns < 0]
    return np.round(np.mean(losses), decimals=2)

def win_loss_ratio(returns):
    return np.round(avg_win(returns) / np.abs(avg_lose(returns)), decimals=2)

def get_equity_curve(returns):
    # compound percent returns into a growth-of-1 equity curve
    equity_curve = (1 + (returns / 100)).cumprod(axis=0)
    return equity_curve

def get_final_equity(equity_curve):
    return equity_curve.iloc[-1]

def drawdown(equity_curve):
    # percent decline from the running equity peak
    eq_series = pd.Series(equity_curve)
    return (eq_series / eq_series.cummax() - 1) * 100

def get_max_drawdown(equity_curve):
    abs_drawdown = np.abs(drawdown(equity_curve))
    return np.round(np.max(abs_drawdown), decimals=2)
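As a quick sanity check on the equity-curve and drawdown math, here is the same logic exercised inline on a tiny hypothetical returns series (these numbers are made up, not backtest output):

```python
import pandas as pd

toy_returns = pd.Series([2.0, -1.0, 3.0, -1.0])  # hypothetical percent returns per trade

# compound percent returns into a growth-of-1 equity curve
equity = (1 + toy_returns / 100).cumprod()

# percent decline from the running equity peak
dd = (equity / equity.cummax() - 1) * 100

print(round(equity.iloc[-1], 4))  # 1.0297
print(round(dd.abs().max(), 2))   # 1.0
```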
# generating equity curve and drawdown
returns = trades['returns']
trades['equity_curve'] = get_equity_curve(returns)
trades['drawdown'] = drawdown(trades['equity_curve'])
equity_curve = trades['equity_curve']
print('Trades executed : %s' % executed_trades(returns))
print('Percent profitable : %s' % win_percent(returns) + '%')
print('Winning trades : %s' % winning_trades(returns))
print('Losing trades : %s' % losing_trades(returns))
print('Even trades : %s' % even_trades(returns))
print('Largest winning trade : %s' % max_win(returns) + '%')
print('Largest losing trade : %s' % min_lose(returns) + '%')
print('Avg. winning trades : %s' % avg_win(returns) + '%')
print('Avg. losing trades : %s' % avg_lose(returns) + '%')
print('Maximum drawdown : %s' % get_max_drawdown(equity_curve) + '%')
print('Final Equity : %s' % np.round(get_final_equity(equity_curve), decimals=2))
monthly_returns = trades.groupby(trades['close_date'].dt.strftime('%B'))['returns'].sum().sort_values()
monthly_returns = monthly_returns.to_frame()
cats = ['January', 'February', 'March', 'April', 'May', 'June', "July", 'August', 'September', 'October', 'November', 'December']
monthly_returns = monthly_returns.reindex(cats, axis=0)
sns.heatmap(monthly_returns, cmap='YlGn', linewidths=0.5, annot=True);
fig, ax = plt.subplots(2, 1, figsize = (7, 7), sharex=True)
eq = trades['equity_curve']
dd = trades['drawdown']
ax[0].plot(eq)
ax[0].set_title('Equity Curve')
dd.plot(ax=ax[1], kind='area')
ax[1].set_title('Drawdown (%)')
plt.show()
At the end of our backtest, we made over 600% of our starting equity with a maximum drawdown of 9.61%.
First, I need to exclude a few columns of the trades data, leaving only the features and target variable.
# target is returns
# feature is open_range, close_range, initial_momentum and duration
clean_trades = trades[['returns', 'open_range' , 'close_range', 'initial_momentum', 'duration']]
clean_trades.head()
clean_trades.describe()
I like to use a correlation matrix and a pair plot, as they are among the fastest ways for me to develop an understanding of all my variables. I can also use the correlation matrix as a feature selection method.
# correlation matrix visualization
sns.heatmap(clean_trades.corr(), cmap=sns.diverging_palette(220, 15, as_cmap=True), annot=True, cbar=False, square=True);
# pairplot visualization
sns.pairplot(clean_trades);
The correlation matrix shows that duration is highly correlated with returns. The pair plot above also indicates a positive relationship between duration and returns. We want to affect the value of returns by changing independent variables; luckily, we can easily fine-tune the duration of our trades. Let's take a deeper look into duration, returns, and their relationship.
# separating wins and losses to take a deeper look into duration and returns
duration = trades['duration']
wins = returns[returns > 0]
losses = returns[returns < 0]
wins_duration = duration[returns > 0]
losses_duration = duration[returns < 0]
# wins trade duration histogram plot
wins_duration.plot(kind='hist');
# losses trade duration histogram plot
losses_duration.plot(kind='hist');
print('Average win trades duration : %s' % wins_duration.mean())
print('Average lose trades duration : %s' % losses_duration.mean())
The histograms of winning and losing trade durations show their distributions: losing trades are heavily concentrated at short durations.
# win returns against duration
sns.scatterplot(x=wins_duration, y=wins);
# lose returns against duration
sns.scatterplot(x=losses_duration, y=losses);
We can see a positive relationship between winning trades' returns and their duration, but a weak to nonexistent one for losing trades. We will use regression analysis to describe the relationship between winning trades and their duration.
X = wins_duration.values.reshape(-1, 1) # independent variable
y = wins.values.reshape(-1, 1) # dependent variable
# splitting train and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
# training the model
model = LinearRegression()
result = model.fit(X_train, y_train)
# predicting y value
y_pred = model.predict(X_test)
# metrics
r2 = metrics.r2_score(y_test, y_pred)
print('Coefficient : %s' % model.coef_)
print('Intercept : %s' % model.intercept_)
print('R-squared : %s' % r2)
# linear regression plot
plt.scatter(X_test, y_test)
plt.plot(X_test, y_pred, color='red')
plt.xlabel('Trade Durations')
plt.ylabel('Win Returns')
plt.title('Linear Regression')
plt.show();
# residual plot
residuals = y_test - y_pred
plt.scatter(x=y_pred, y=residuals)
plt.hlines(y=0, xmin=min(y_pred), xmax=max(y_pred), linestyle='dashed')
plt.ylabel('residuals')
plt.xlabel('prediction')
plt.show();
From our observation, more than 61% of the variance in winning-trade returns is explained by the model.
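For reference, the R-squared reported above is simply 1 - SS_res / SS_tot; a minimal hand computation on hypothetical numbers (not our actual regression data):

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical observed values
y_hat = np.array([1.1, 1.9, 3.2, 3.8])   # hypothetical predictions

ss_res = np.sum((y_true - y_hat) ** 2)           # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2 = 1 - ss_res / ss_tot
print(round(r2, 3))  # 0.98
```

This matches what metrics.r2_score computes from y_test and y_pred.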
We've seen the results from our model. Let's tune and modify the algorithm to make it easier to evaluate. As traders, we should have a favorable maximum drawdown in mind; we don't want to let the drawdown go too far, as it can affect us psychologically, so we will limit the maximum drawdown to 10% of the account.
# modified trades function, adding summary
def get_trades_summary(min_duration, simulation_data):
    # context
    position = 0
    stop_loss = -1
    # data for our model
    side = []
    durations = []
    open_date = []
    close_date = []
    open_range = []  # open range is range width at trade entry
    close_range = []  # close range is range width at trade exit
    returns = []  # returns is the percent difference of price between entry and exit
    initial_momentum = []  # range width at the momentum shift minus the compressed range width
    for index, rows in simulation_data.iterrows():
        if position == 0:
            if simulation_data.loc[index, 'signal'] == 1:
                position = 1
                bp = simulation_data.loc[index+1, 'open']
                oi = index
                open_date.append(simulation_data.loc[index+1, 'timestamp'])
                open_range.append(simulation_data.loc[index, 'range_width'])
                initial_momentum.append(simulation_data.loc[index, 'range_width'] - simulation_data.loc[index-1, 'range_width'])
                side.append(position)
            elif simulation_data.loc[index, 'signal'] == -1:
                position = -1
                sp = simulation_data.loc[index+1, 'open']
                oi = index
                open_date.append(simulation_data.loc[index+1, 'timestamp'])
                open_range.append(simulation_data.loc[index, 'range_width'])
                initial_momentum.append(simulation_data.loc[index, 'range_width'] - simulation_data.loc[index-1, 'range_width'])
                side.append(position)
        elif position == 1:
            if (simulation_data.loc[index+1, 'low'] - bp) / bp * 100 <= stop_loss:
                position = 0
                duration = index - oi
                returns.append(stop_loss)
                close_date.append(simulation_data.loc[index, 'timestamp'])
                close_range.append(simulation_data.loc[index, 'range_width'])
                durations.append(duration)
            elif (simulation_data.loc[index, 'close'] <= simulation_data.loc[index, 'upper_middle']) and (index - oi > min_duration):
                position = 0
                duration = index - oi
                sp = simulation_data.loc[index, 'close']
                pc = ((sp - bp) / bp) * 100
                returns.append(pc)
                close_date.append(simulation_data.loc[index, 'timestamp'])
                close_range.append(simulation_data.loc[index, 'range_width'])
                durations.append(duration)
        elif position == -1:
            if (sp - simulation_data.loc[index, 'high']) / sp * 100 <= stop_loss:
                position = 0
                duration = index - oi
                returns.append(stop_loss)
                close_date.append(simulation_data.loc[index, 'timestamp'])
                close_range.append(simulation_data.loc[index, 'range_width'])
                durations.append(duration)
            elif (simulation_data.loc[index, 'close'] >= simulation_data.loc[index, 'lower_middle']) and (index - oi > min_duration):
                position = 0
                duration = index - oi
                bp = simulation_data.loc[index, 'close']
                pc = ((sp - bp) / sp) * 100
                returns.append(pc)
                close_date.append(simulation_data.loc[index, 'timestamp'])
                close_range.append(simulation_data.loc[index, 'range_width'])
                durations.append(duration)
    trades = pd.DataFrame({'open_date': open_date,
                           'close_date': close_date,
                           'duration': durations,
                           'open_range': open_range,
                           'close_range': close_range,
                           'initial_momentum': initial_momentum,
                           'side': side,
                           'returns': returns})
    summary = pd.DataFrame({'min_duration': min_duration,
                            'final_equity': get_final_equity(get_equity_curve(trades['returns'])),
                            'max_drawdown': get_max_drawdown(get_equity_curve(trades['returns']))}, index=[0])
    return trades, summary
# generating improved trades summary
improved_trades = pd.DataFrame()
min_duration = 1  # start with a minimum trade duration of 1 bar
max_drawdown = 9.61  # initial max drawdown from our unimproved algorithm
while max_drawdown < 10:  # stop once max drawdown reaches our 10% limit
    _, summary = get_trades_summary(min_duration, simulation_data)
    improved_trades = pd.concat([improved_trades, summary], ignore_index=True)
    max_drawdown = summary['max_drawdown'][0]
    min_duration += 1
# resetting index
improved_trades = improved_trades.reset_index(drop=True)
improved_trades.describe()
improved_trades.tail(5)
# final equity histogram plot
improved_trades['final_equity'].plot(kind='hist');
# max drawdown histogram plot
improved_trades['max_drawdown'].plot(kind='hist');
# final equity and max drawdown per minimum-duration setting
fig, ax = plt.subplots(2, 1, figsize=(7, 7), sharex=True)
fe = improved_trades['final_equity']
md = improved_trades['max_drawdown']
ax[0].plot(fe)
ax[0].set_title('Final Equity')
md.plot(ax=ax[1], kind='area')
ax[1].set_title('Maximum Drawdown (%)')
plt.show()
Using 37 bars as the minimum trade duration increased our final equity from 641% to 875% over roughly two years, with a maximum drawdown of 9.44%.
Backtesting is only part of evaluating the efficacy of a trading strategy. We would also like to compare it to other available strategies and/or assets to see how well we have done. People in Indonesia often consider gold a great investment, so we will benchmark our ETHUSD momentum algorithm against a buy-and-hold strategy for XAU/IDR, the ticker for gold (per oz) against the Indonesian rupiah. We will only use 2019 data for both.
# loading xauidr data from csv file
xauidr = pd.read_csv('xauidr_2019.csv', parse_dates=True)
xauidr.info()
xauidr.head()
# xauidr data cleaning
xauidr['Tanggal'] = pd.to_datetime(xauidr['Tanggal'])
xauidr.sort_values(by=['Tanggal'], inplace=True, ascending=True)
xauidr.reset_index(drop=True, inplace=True)
xauidr = xauidr.rename(columns={'Tanggal':'date','Terakhir':'close','Tertinggi':'high','Terendah':'low','Pembukaan':'open', 'Perubahan%':'returns'})
# removing %
start, stop, step = 0, -1, 1
xauidr['returns'] = xauidr['returns'].astype(str)
xauidr['returns'] = xauidr['returns'].str.slice(start, stop, step)
# replacing (,) with (.)
xauidr['returns'] = (xauidr['returns'].replace(',', '.', regex=True))
xauidr['returns'] = xauidr['returns'].astype(float)
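The same percent-sign and decimal-comma cleanup can also be done with string methods in one pass; a small sketch on made-up values:

```python
import pandas as pd

raw = pd.Series(['0,35%', '-1,20%'])  # hypothetical Perubahan% entries
clean = raw.str.rstrip('%').str.replace(',', '.', regex=False).astype(float)
print(clean.tolist())  # [0.35, -1.2]
```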
# generating equity curve column and values
xauidr['equity_curve'] = get_equity_curve(xauidr['returns'])
xauidr.head()
# generating trades with 37 bars minimum duration
trades_37, _ = get_trades_summary(37, simulation_data)
# generating equity curve column and values
trades_37['equity_curve'] = get_equity_curve(trades_37['returns'])
# filtering trades, only trades from 2019 remain
start_date = '2019-01-01'
end_date = '2019-12-31'
mask = (trades_37['open_date'] > start_date) & (trades_37['close_date'] <= end_date)
trades_37_2019 = trades_37.loc[mask]
trades_37_2019 = trades_37_2019.reset_index(drop=True)
trades_37_2019 = trades_37_2019[['close_date', 'equity_curve']]
# plotting xauidr equity curve vs momentum (trades_37_2019)
ax_bench = xauidr['equity_curve'].plot(label='xauidr')
trades_37_2019['equity_curve'].plot(ax=ax_bench, label='momentum')
ax_bench.legend(loc='best');
print('XAUIDR buy/hold final equity : %s' % get_final_equity(xauidr['equity_curve']))
print('ETHUSD momentum algorithm final equity : %s' % get_final_equity(trades_37_2019['equity_curve']))
Our momentum algorithm outperforms buying and holding gold by more than 200%.
We found trade duration to be the best feature to fine-tune for improving our algorithm's profitability. Does backtesting predict future performance? Not at all. Backtests tend to overfit; just because a backtest shows high growth doesn't mean that growth will hold in the future. We need to forward test, or run an out-of-sample test, to determine whether our strategy is actually robust. Still, outperforming a benchmark asset and/or strategy by a wide margin is not bad at all!